ABSTRACT

A model is developed for a computer communications processor with
large storage capacity. The model follows a single-message positive-
acknowledgment protocol in which the transmitting node acquires messages
from the external environment, transmits them to the receiver, and
awaits acknowledgment. Special attention is given to modeling the
effects of transmission errors, in either the message or the
acknowledgment, upon the service time of the computer communications
network. These effects of error have generally been ignored in previous
investigations.

In order to account realistically for the service-time distributions found
in actual computing systems, which have bounded domains, an analysis
of M/G/1 queueing servers with service times of this sort was necessary.
This analysis (Appendix) focuses upon servers with uniform service times,
and those whose service-time distributions can be approximated by a
section of a normal distribution.

STORE-AND-FORWARD COMPUTER COMMUNICATIONS
OVER NOISY CHANNELS


I.  INTRODUCTION 

A.  Computer  Communication  Networks 

During the 1960's, computer timesharing systems developed from experimental prototypes
into sophisticated tools for data processing and for research. Specialized hardware, software,
and data resources were available only to users at the sites where they were maintained, while
the area of possible utilization of these resources extended far beyond any one installation's
immediate proximity. A natural demand for computer communication networks had arisen by
the late sixties and early seventies in the computing research community. It was during this
same period (in 1969) that the cost of the computers required to dynamically allocate
communication facilities dropped below the cost of the facilities being allocated. Thus, for the
first time, high-speed networks became cost effective.

The ARPANET, a generalized experimental computer network (shown in Fig. I-1), was
developed and expanded in the late sixties and seventies. It is now regarded as a mature and
successful system and provides daily service to hundreds of users.

Networking has seen application in many areas other than computing research. Government
and military users have a continuing need for fast, reliable data communications between a large
number of points, as do large business organizations in retailing, finance, and transportation.
Electronic funds transfer systems will increase the national need for fast, efficient
communications still further; and many systems, both in and out of Government, are currently
being deployed, developed, or planned to meet these varied needs. They represent a sizable
investment of both money and time, an investment that can be made wisely only when the
principles of operation of these complex systems are well understood. It is toward the
furtherance of that understanding that this work is directed.

B.  Structure  of  Store-and-Forward  Networks 

A store-and-forward network consists of a number of physically separated points called
nodes, each with a certain storage capacity and communication links to one or (usually) more
other points. Messages enter the network nodes from external sources, and the communications
processor determines which of its neighbors should receive each message, either to deliver it
to a destination attached to the node or to forward it to another node which will either deliver it
to its final destination or forward it again to other nodes until the destination is reached. The
routing decision (which neighbor to forward to) is an important matter, greatly affecting network
performance, and has been extensively studied (Refs. 4, 5).

When a message is transmitted to another node, the sender retains a copy of it in storage
until the receiver determines, by any of several error-detecting schemes, that the message is
correct and sends back a special acknowledgment message to the transmitting node, thus
freeing the storage used to retain the copy. The message is retransmitted if the acknowledgment
is not received within a predetermined time-out interval. Errors that occur due to the presence
of noise on any real communications channel may be found in the acknowledgments as well as
in the basic message traffic. These acknowledgment errors will result in the retention in
storage and retransmission of messages that have actually been correctly received.

C. Nodal Blocking

If messages are arriving at a node at a sufficiently fast rate or errors are requiring that
messages be retained for long periods of time, it is possible for all of a node's message storage
space to become full. Each message requires a buffer, and if none are available, the node is
said to be blocked to incoming message traffic. Buffer space is always reserved, however, for
acknowledgments and certain other high-priority types of messages. This is necessary so a
blocked node can become unblocked by receiving acknowledgments and resuming normal
operations. When the node is blocked, arriving messages are ignored, and therefore not
acknowledged. The sender will, after the prescribed time interval, retransmit and wait for an
acknowledgment. Thus, nodes transmitting to a blocked node will be retaining more of their
messages for retransmission and are thus more likely to become blocked themselves.

D.  Previous  Work 

The manner in which nodes become blocked due to stochastic message arrivals and service
completions was analyzed by Schweitzer, Lam, and Closs by modeling the network as a
network of simple queueing servers. They consider transmission errors and nodal blocking to
be independent events, a simplifying but incorrect assumption. Their results are useful in the
analysis of networks containing very high quality communications channels. They employ an
incidence matrix characterization of network routing structure that greatly simplifies
calculation of the loading on individual nodes. The lack of consideration of errors in
acknowledgments is a serious flaw in this model, though in quiet environments it does provide
a useful tool for evaluating the stochastic effects of arrival rates and communications service
times.

Propagation of blocking in a network was studied by Ziegler, who modeled each node as a
two-state (either blocked or free) Markovian system and studied, both analytically and by means
of simulation, the spread of blocking in the net. He found that, as one would suspect, the spread
is very rapid and can quickly reduce network throughput to zero. This was without any
consideration of the effects of errors.

The severity of the blocking problem has caused designers of actual networks to go to great
lengths to circumvent it. Most approaches involve cutting off incoming traffic at the source when
potential blocking situations develop; others employ reservation schemes to avoid such
situations in advance. A survey of these flow control techniques can be found in Ref. 11. While
the methods discussed do avoid blocking, they do so at a high price in additional overhead. It is
important to consider the operation of the basic communications processes being used. If sound
engineering decisions are made in the initial network design, the various "fixes" that must be
employed in a real system will be invoked less frequently. System resources can then be used
toward productive ends, rather than to support internal control functions. Most analyses of
network performance, even very detailed ones, do not adequately address the problems caused by
errors in the transmission and acknowledgment of messages. As will be seen in the following
sections, these error rates have a strong effect on the performance of store-and-forward
systems. These effects are basic in nature and not the result of any particular algorithm or
implementation. The goal here is to further the understanding of the processes underlying the
operation of communications networks in order to make possible system designs that increase
the amount of useful work the network can perform.


II. THE EXTENDED STORAGE MODEL OF A COMMUNICATIONS NODE

A.  General  Description  and  Architectural  Considerations 

Nearly all communications processors constructed in the past have avoided the use of
rotational storage devices for reliability reasons. It was considered highly undesirable to
include mechanical components in a device designed to provide uninterrupted service for long
periods of time. Thus, the amount of memory available for buffering was often limited by both
addressing and economic constraints.

With the increasing availability of cheap LSI memories, large virtual address spaces, and
the expected appearance of economical solid-state mass-storage devices using charge-coupled
devices, magnetic bubbles, or electron beam technologies, it is reasonable to anticipate
future communications processor designs with far less stringent restrictions on buffer space.
A general discussion of the effect of these new technologies on computer system architectures
is found in Ref. 17.

The  model  discussed  here  is  directly  applicable  to  such  large-memory  machines,  and  yields 
information  on  buffering  requirements  that  enables  one  to  determine  how  much  space  actually 
constitutes  a "large"  memory,  and  what  the  approximate  distribution  of  buffer  utilization  will  be. 

B.  Network  of  Queues  Model 

Figure II-1 shows the network of queues and servers which constitutes the extended storage
model. Messages arrive from external sources with rate $\lambda$ and queue at the transmission
server X, whose service-time distribution has probability density function (pdf) $f_X(x)$. After
transmission, messages that arrive without error, and therefore will not be timed out, queue at
the acknowledgment server A. A fraction $P_1$ of the messages transmitted will not be served by
the acknowledgment process because of time outs or errors in transmission of the message.


Fig. II-1. The extended storage model.


If $f_A(a)$ is a pdf for the random variable $A$ which denotes acknowledgment service time,
and $T$ is the length of the time-out interval, then the fraction of messages that are timed
out, $f_T$, is

$$ f_T = \int_T^{\infty} f_A(a)\,da \qquad \text{(II-1)} $$

The probability of a bit error on the outgoing link is $e_1$ and the message length is $l_m$
(bits), so the fraction of messages received with errors, $f_m$, is

$$ f_m = 1 - (1 - e_1)^{l_m} \qquad \text{(II-2)} $$

The branching probability into the time-out servers (labeled T on Fig. II-1) is then

$$ P_1 = f_m + (1 - f_m)\, f_T \qquad \text{(II-3)} $$

indicating  that  all  messages  with  errors  and  those  without  errors  that  are  not  acknowledged 
quickly  enough  will  be  timed  out.  The  decision  tree  in  Fig.  II-2  illustrates  this  calculation. 

Fig. II-2. Tree for calculating $P_1$. Events are denoted above branches,
probabilities are below.

The functional dependence of $P_1$ on $l_m$, $e_1$, and $T$ is found by substituting the
expressions (II-1) and (II-2) into (II-3), yielding

$$ P_1 = 1 - (1 - e_1)^{l_m} \left[ 1 - \int_T^{\infty} f_A(a)\,da \right] \qquad \text{(II-4)} $$

$$ P_1(l_m, e_1, T) = 1 - (1 - e_1)^{l_m} \int_0^{T} f_A(a)\,da \qquad \text{(II-5)} $$

since $f_A$ is a density defined over a positive domain.
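
As a rough numerical illustration of Eqs. (II-2) through (II-5), the sketch below evaluates the time-out branching probability in Python. The function names, the uniform acknowledgment-time density, and the numerical values are assumptions made for this example only; they are not taken from the report.

```python
def message_error_fraction(e1, l_m):
    """Fraction of messages with at least one bit error, Eq. (II-2)."""
    return 1.0 - (1.0 - e1) ** l_m


def uniform_cdf(x, lo, hi):
    """CDF of a uniform density on [lo, hi] (hypothetical choice for f_A)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def timeout_branch_probability(e1, l_m, ack_cdf_at_T):
    """P1 from Eq. (II-5); ack_cdf_at_T is the integral of f_A over [0, T]."""
    return 1.0 - (1.0 - e1) ** l_m * ack_cdf_at_T


# Example: 200-bit messages, bit error rate 1e-3, acknowledgment service
# times assumed uniform on [0.1, 0.5], time-out interval T = 0.4.
F_T = uniform_cdf(0.4, 0.1, 0.5)  # probability the ack beats the time-out
p1 = timeout_branch_probability(1e-3, 200, F_T)

# The same value via Eq. (II-3): f_m + (1 - f_m) * f_T with f_T = 1 - F_T.
f_m = message_error_fraction(1e-3, 200)
p1_check = f_m + (1.0 - f_m) * (1.0 - F_T)
print(p1, p1_check)
```

The two printed values should agree to rounding error, which is a quick way to confirm that (II-5) is only a rearrangement of (II-3).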

The time-out process is represented by the infinite parallel set of non-queueing servers
labeled T, each with a deterministic service time equal to the time-out interval $T$. This
representation is preferable to a queueing server with deterministic service time since real time
outs do proceed in parallel. Two messages entering the parallel servers at times 0 and $t$
(where $t < T$) will be timed out correctly at times $T$ and $t + T$ rather than $T$ and $2T$, as
would be the case for an (initially idle) single server. All timed-out messages proceed back into
the queue at server X for retransmission.

Messages not destined for time outs follow the upper branch to the acknowledgment server
A with probability $1 - P_1$. The distribution of service times at this server is the conditional
distribution of $f_A(a)$ given that $a$ is less than the time-out interval $T$. This is necessary to
account for those messages that were correctly received but will be timed out and follow the
lower branch to the time-out servers. Denoting this distribution by $g_A(a)$ we have

$$ g_A(a) = \begin{cases} \dfrac{f_A(a)}{\int_0^{T} f_A(u)\,du} & \text{for } a < T \\ 0 & \text{for } a \ge T \end{cases} \qquad \text{(II-6)} $$


An acknowledgment will be sent by server A over a line with bit error rate $e_2$. If the
acknowledgment is received in error, the message will be retransmitted. This is shown as the
lower branch leaving A, with probability given by

$$ P_2 = 1 - (1 - e_2)^{l_a} \qquad \text{(II-7)} $$

where $l_a$ is the length of the acknowledgment in bits. Messages whose acknowledgments are
received without error proceed up the second branch with probability $1 - P_2$ and leave the
system.
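
A brief sketch of how Eq. (II-7) combines with the time-out branching probability may help here. It assumes that successive transmission attempts of a message are independent, so the number of attempts is geometric; the function names and the 20-bit acknowledgment length are hypothetical.

```python
def ack_error_probability(e2, l_a):
    """P2 from Eq. (II-7): probability an l_a-bit acknowledgment is corrupted."""
    return 1.0 - (1.0 - e2) ** l_a


def attempt_success_probability(p1, p2):
    """Probability that one transmission attempt ends with a good acknowledgment."""
    return (1.0 - p1) * (1.0 - p2)


def expected_transmissions(p1, p2):
    """Mean transmissions per message, assuming independent attempts."""
    return 1.0 / attempt_success_probability(p1, p2)


# Example: 20-bit acknowledgments (hypothetical) at a bit error rate of 1e-3,
# combined with an assumed time-out branching probability of 0.18.
p2 = ack_error_probability(1e-3, 20)
print(p2, expected_transmissions(0.18, p2))
```

Under the independence assumption, the expected number of transmissions is also the ratio of the total arrival rate at the transmission server to the external arrival rate.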


III.  SOLUTION  OF  THE  EXTENDED  STORAGE  MODEL 


A.  Analysis 


Networks of the type described in the previous section can be analyzed by using the
branching probabilities and external arrival rate to compute the total arrival rate to each
element of the system and solving for each element separately. This method is known to be exact
for networks containing only exponential servers and has been widely employed in the analysis
of complex computer systems. The decomposition method has recently been shown to be
accurate when applied to closed queueing networks with general service-time distributions,
and it is reasonable to expect even greater accuracy when solving open systems (such as the
one described in the previous section) since there is less coupling between the various servers.

Let $r_x$, $r_a$, and $r_t$ denote the total arrival rates at the transmission, acknowledgment,
and time-out servers, respectively. Then, by considering the steady state flows shown in
Fig. II-1, we have

$$ r_x = \lambda + r_t + P_2\, r_a $$

$$ r_t = P_1\, r_x $$

and

$$ r_a = (1 - P_1)\, r_x \qquad \text{(III-1)} $$


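Solving the above equations for $r_x$, $r_t$, and $r_a$ yields

$$ r_x = \frac{\lambda}{(P_1 - 1) P_2 - P_1 + 1} $$

$$ r_t = \frac{\lambda P_1}{(P_1 - 1) P_2 - P_1 + 1} $$

and

$$ r_a = \frac{\lambda}{1 - P_2} \qquad \text{(III-2)} $$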
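
The closed-form solution can be checked numerically. The sketch below, with hypothetical function names and example values, computes the rates from the closed form (whose denominator factors as $(1 - P_1)(1 - P_2)$) and compares them with a simple fixed-point iteration of the flow equations (III-1).

```python
def arrival_rates(lam, p1, p2):
    """Closed-form total arrival rates (r_x, r_t, r_a) of Eq. (III-2)."""
    denom = (p1 - 1.0) * p2 - p1 + 1.0  # equals (1 - p1) * (1 - p2)
    r_x = lam / denom
    r_t = lam * p1 / denom
    r_a = lam / (1.0 - p2)
    return r_x, r_t, r_a


def arrival_rates_by_iteration(lam, p1, p2, iterations=200):
    """Iterate the flow equations (III-1) from r_x = lam until they settle."""
    r_x = lam
    for _ in range(iterations):
        r_t = p1 * r_x
        r_a = (1.0 - p1) * r_x
        r_x = lam + r_t + p2 * r_a
    return r_x, p1 * r_x, (1.0 - p1) * r_x


# Example values (assumed): lam = 1.0 messages per unit time, P1 = 0.2, P2 = 0.05.
closed = arrival_rates(1.0, 0.2, 0.05)
iterated = arrival_rates_by_iteration(1.0, 0.2, 0.05)
print(closed)
print(iterated)
```

The iteration converges because each pass multiplies the error in $r_x$ by $P_1 + P_2 - P_1 P_2$, which is less than one whenever both branching probabilities are less than one.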


The arrival rates for the two queueing servers X and A can be used along with the
distributions $g_A(a)$ and $f_X(x)$ to find the mean and variance of the distribution of the
number of customers in the queues or in service at X and A. The formulas derived in the
Appendix for a class of finite-domained distributions will be used for this purpose. Let us
denote these means and variances by $\bar{N}_a$, $\bar{N}_x$, $\sigma_a^2$, and $\sigma_x^2$.


The time-out servers must be analyzed separately. Assuming a Poisson arrival process
with rate $r_t$, the number of busy servers will be Poisson distributed with mean $T r_t$. Thus,
we have

$$ \bar{N}_t = T r_t \qquad \text{(III-3)} $$

and since the variance of a Poisson random variable is equal to its mean

$$ \sigma_t^2 = T r_t \qquad \text{(III-4)} $$
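
The Poisson result for the time-out servers can be illustrated with a small simulation. This is only a sketch under the stated Poisson-arrival assumption; the function name, seed, rate, and horizon are arbitrary choices for the example.

```python
import bisect
import random


def simulate_busy_timeout_servers(rate, T, horizon, seed=0):
    """Sample the number of messages in time-out at regular instants.

    Arrivals form a Poisson process with the given rate; each arrival
    occupies its own server for exactly T time units.
    """
    rng = random.Random(seed)
    arrivals = []
    t = rng.expovariate(rate)
    while t < horizon:
        arrivals.append(t)
        t += rng.expovariate(rate)

    samples = []
    s = T
    while s < horizon:
        # Busy servers at time s = arrivals in the window (s - T, s].
        busy = bisect.bisect_right(arrivals, s) - bisect.bisect_right(arrivals, s - T)
        samples.append(busy)
        s += T  # spacing the samples by T keeps the windows disjoint

    mean = sum(samples) / len(samples)
    var = sum((b - mean) ** 2 for b in samples) / (len(samples) - 1)
    return mean, var


# Example: r_t = 5 messages per unit time, T = 2, so T * r_t = 10.
mean_busy, var_busy = simulate_busy_timeout_servers(5.0, 2.0, 4000.0)
print(mean_busy, var_busy)
```

Both sample statistics should lie near $T r_t$, in line with (III-3) and (III-4).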
The number of buffers in use is

$$ N = N_x + N_t + N_a \qquad \text{(III-5)} $$

with mean

$$ \bar{N} = \bar{N}_x + \bar{N}_t + \bar{N}_a \qquad \text{(III-6)} $$

and variance

$$ \sigma_N^2 = \sigma_x^2 + \sigma_t^2 + \sigma_a^2 \qquad \text{(III-7)} $$


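These two quantities are used along with Chebychev's inequality

$$ P\left[\,|N - \bar{N}| \ge \epsilon\,\right] \le \frac{\sigma_N^2}{\epsilon^2} \quad \text{for } \epsilon > 0 \qquad \text{(III-8)} $$

to determine how many buffers are required to assure that the blocking probability does not
exceed a specified value $P_B$. This number, $M$, is found from (III-8) to be

$$ M = \bar{N} + \frac{\sigma_N}{\sqrt{P_B}} \qquad \text{(III-9)} $$


B. Effects of Error and Arrival Rates on Queue Size and Delay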


The results derived in the previous sections will be used to analyze a specific
communication process, described in Table III-1.

The error and arrival rates will range from zero to those values beyond which the system
will not operate (i.e., contain only finite queues).

An important parameter in the extended storage model is $P_1$, the branching probability
into the time-out servers T. For this example, we have

$$ P_1 = 1 - (1 - e_1)^{200} $$

since the upper bound on acknowledgment service times is less than the time-out interval and
the integral on the right-hand side of Eq. (II-5) evaluates to one. Figure III-1 shows a plot of
$P_1$ vs error rate. For longer messages the ascent becomes even more rapid. The branching
probability $P_2$ out of the acknowledgment process into retransmission is similar in form. The
families of curves shown in Figs. III-2 and -3 illustrate the behavior of a store-and-forward
message system in noisy environments ranging from an extremely high quality (a bit error rate
of 10^) to an extreme noise level (10^) which would not normally be encountered in a
commercial application. Note from Fig. III-3 that the expected delay goes up in high-noise
situations, even at very low traffic rates. This is due to the near certainty of retransmissions.
When these branching probabilities are used with Eqs. (III-1) through (III-6) to compute the
expected queue size $\bar{N}$ as a function of arrival rate for a range of error rates
corresponding to those normally found on noisy commercial facilities (see Fig. III-4), the
behavior of $\bar{N}$ can be seen in Fig. III-5. These graphs show the entire arrival range from
zero to beyond the saturation point where infinite queueing will occur. Note how this point moves
back with increases in error rates. A detailed view of a portion of Fig. III-5 is found in
Fig. III-6, showing a more reasonable operating range of arrival rates. The effects on message
delay can be found by means of Little's formula $\bar{N} = \lambda W$, where $W$ is the expected
delay. The expected waits associated with the queues shown in Figs. III-5 and -6 are shown,
respectively, in Figs. III-7 and -8.

An interesting way of looking at the error-rate dependencies shown implicitly in Figs. III-5
through -8 is to hold the arrivals constant and vary instead the error rate. This is done in
Figs. III-9 (showing the size) and -10 (showing delay), which clearly show the existence of error
level thresholds, corresponding to the saturation points in arrival rates, beyond which the
system cannot be operated. These are shown explicitly in Fig. III-11, which plots the maximum
allowable error rate against the arrival rate.
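
The saturation thresholds discussed above can be sketched numerically. The example assumes that each queueing server saturates when its total arrival rate times its mean service time reaches one, that the same bit error rate applies to both links, and that no correctly received message is timed out. The mean service times `s_x` and `s_a` and the acknowledgment length are hypothetical values, since the parameters of Table III-1 are not reproduced here.

```python
def branch_probabilities(e, l_m, l_a):
    """P1 and P2 for a common bit error rate e, with no time-outs of good messages."""
    p1 = 1.0 - (1.0 - e) ** l_m
    p2 = 1.0 - (1.0 - e) ** l_a
    return p1, p2


def max_arrival_rate(e, l_m, l_a, s_x, s_a):
    """Largest external rate keeping both queueing servers below saturation."""
    p1, p2 = branch_probabilities(e, l_m, l_a)
    limit_x = (1.0 - p1) * (1.0 - p2) / s_x  # from r_x * s_x < 1
    limit_a = (1.0 - p2) / s_a               # from r_a * s_a < 1
    return min(limit_x, limit_a)


def max_error_rate(lam, l_m, l_a, s_x, s_a, tol=1e-12):
    """Bisection for the error rate at which lam reaches the saturation point."""
    if max_arrival_rate(0.0, l_m, l_a, s_x, s_a) <= lam:
        return None  # unstable even on a noiseless channel
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_arrival_rate(mid, l_m, l_a, s_x, s_a) > lam:
            lo = mid
        else:
            hi = mid
    return lo


# Example: 200-bit messages, 20-bit acknowledgments, mean service times of
# 0.5 and 0.1 time units, external arrival rate 1.0 (all values assumed).
e_star = max_error_rate(1.0, 200, 20, 0.5, 0.1)
print(e_star, max_arrival_rate(e_star, 200, 20, 0.5, 0.1))
```

Sweeping `lam` over a range of values and recording `max_error_rate` traces out a curve of the same kind as the maximum operating error rate plot.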

Fig. III-9. Mean queue length vs error rates, zero to saturation, for a range of
arrival rates $\lambda = k\lambda_0$ with $\lambda_0 = 1.0$.

Fig. III-10. Delay vs error rates, zero to saturation, with arrival rates $\lambda$ as in
Fig. III-9.

Fig. III-11. Maximum operating error rate vs arrival rate.

C. Bounds on Blocking Probability

Equation (III-9) can be used to find the number of buffers required to assure some specified
blocking probability. This relationship is shown, for a range of error rates, in Fig. III-12.
Note that to approach zero probability of blocking an asymptotically infinite number of buffers
is required. These curves also demonstrate that error rate has a much stronger effect on the
required buffer pool size when very low blocking probabilities are involved than when more
moderate values are sought.
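
A minimal sketch of the buffer-sizing rule of Eq. (III-9) follows. The function names and the example mean and variance are assumptions for illustration. Because Chebychev's inequality bounds deviations in both directions, the resulting buffer count is conservative.

```python
import math


def buffers_required(mean_n, var_n, p_block):
    """Smallest integer M with M >= mean + sigma / sqrt(p_block), Eq. (III-9)."""
    if not 0.0 < p_block <= 1.0:
        raise ValueError("p_block must lie in (0, 1]")
    return math.ceil(mean_n + math.sqrt(var_n / p_block))


def chebyshev_blocking_bound(mean_n, var_n, m):
    """Upper bound on P[|N - mean| >= m - mean], capped at one."""
    if m <= mean_n:
        return 1.0
    return min(1.0, var_n / (m - mean_n) ** 2)


# Example (assumed values): mean 12.3 buffers in use, variance 20.5,
# and a target blocking probability of at most 0.05.
M = buffers_required(12.3, 20.5, 0.05)
print(M, chebyshev_blocking_bound(12.3, 20.5, M))
```

Repeating the calculation for smaller target probabilities shows the rapid growth in buffer count that the text describes as the blocking probability approaches zero.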


Fig. III-12. Blocking probability vs number of buffers for a range of error rates
$e = k e_0$ with $e_0 = 0.0003$.

Fig. III-13. Blocking probability vs error rate for 25, 50, and 100 buffer systems.


By looking at the vertical intercept points of the type of curves shown in Fig. III-12, it is
possible to find the relationship between blocking probability and error rate for a fixed number
of buffers. The result when this is done for systems with 25, 50, and 100 buffers is shown in
Fig. III-13.

IV. CONCLUSIONS


A.  Error  Rates  and  Computer  Communications 

Error probabilities are basic properties of all communications systems. They are of
fundamental importance in the classical analyses of both analog (Ref. 23) and digital (Ref. 24)
communications processes.

A model for store-and-forward communications was analyzed with full consideration of
error rates, and strong effects on system performance were found to exist. It provides a
straightforward way of showing this basic dependence.

The extended storage model makes use of new results derived from the Pollaczek-Khinchin
formulas for M/G/1 queues with service-time distributions that are bounded both above and
below (see Appendix). This is the case for virtually all real computer systems, but has often been
ignored due to the analytical complexity it entails.

The existence of sharp error-rate thresholds, analogous to the arrival-rate bounds found
in this research and by others, is a clear demonstration of an important effect that has been
previously overlooked. The existence of a region of the error-arrival-rate space beyond which
communications nodes cannot operate without incurring infinite queueing delays has been shown,
along with a method of finding this region with relative ease.

The option of allocating resources to quieter channels rather than additional memory or
processing capacity is often overlooked. Much very recent, and very detailed, work on
computer communications systems, such as in Ref. 25, does not consider the effect of
communications errors at all. Others, as in Ref. 26, have looked at various design issues with
full consideration given to error effects, and found strong dependencies due, in part, to the
effects observed here.

Real computer communications systems are, of course, more complex than the models that
have been analyzed. Each will differ in its approach (or lack of one) to flow control, routing,
reservations, sequencing, and many other functions performed in an actual system. The models
developed in this research are not substitutes for simulation or actual testing, but are analytic
tools used to show the operating characteristics of more general systems. The models, free of
unnecessary software detail, depend only on fundamental physical parameters and yield operating
bounds against which actual or proposed designs can be compared. This kind of analysis has an
important place in computer communications systems analysis and design.

Further work in this area could include considerations of the bursty nature of errors on
actual communications facilities. A more detailed exploration of the multidimensional space of
arrivals, errors, time-out intervals, message size distributions, service rates, degree of
processor parallelism, and communications protocols could be performed using numerical
multivariate optimization techniques, subject to differing cost constraints.

B.  Use  of  Symbolic  Manipulation  Techniques 

MACSYMA, the recently matured symbolic manipulation system at the M.I.T. Artificial
Intelligence Laboratory (Ref. 27), was used extensively in this work, particularly in the
Appendix. When using MACSYMA, one finds the distinction between numerical and analytical
solutions becomes somewhat blurred. The symbolic answers to some relatively simple problems
can easily fill several pages, and their usefulness in comparison with ordinary numerical
methods is questionable.

Intermediate expression swell, that demon which causes symbolic representations to
expand manyfold during processing and then shrink back in the final stages of a symbolic
operation, is a paradigm for the successful use of the symbolic operations (each with its own
case of intermediate expression swell). Once the problem has been formulated, the best use of
the symbolic manipulator occurs when the intermediate results of the calculation are large,
complex objects which we could not normally deal with, but the final answers take a form simple
enough to be useful and comprehensible as algebraic expressions. This was the fortunate case in
the analysis of the M/G/1 queueing system in the Appendix. It should be kept in mind when
solving problems symbolically that mistakes are as easy to make in MACSYMA as on paper, though
they are admittedly of a different character. A useful technique was found to be the
incorporation of checking techniques into the calculations that would be far too time consuming
to do manually, e.g., making sure that expressions which are supposed to be probability
densities integrate to "1.0" or calculating the same quantity by two or more methods and
comparing the results.

APPENDIX

M/G/1 QUEUEING SERVERS WITH BOUNDED SERVICE TIME DISTRIBUTIONS


I.  INTRODUCTION 

Many events in computing systems occur over a bounded time range. Virtually all
processing tasks require some minimum amount of time to accomplish and, barring the
occasional software idiosyncrasy that can shunt a task off into the low-priority queue for a
very long time, each task will be completed (or abnormally terminated) within some
reasonable maximum time. In communications, time-sharing, and other real-time systems, the
difference between these upper and lower bounds will often be small.

Most of the literature on queueing theory deals with servers having exponential or Erlang
distributions, both defined on the positive infinite half interval.

In this appendix, M/G/1 queueing servers having service-time distributions with finite
domains will be considered, and the first two moments of the queue size distribution derived.


II. THEORY OF M/G/1 SERVERS

The basis for the analysis of M/G/1 systems was formulated by Pollaczek and Khinchin in
the early thirties; treatments in contemporary notation can be found in Ref. 2 (Vol. 1) and
Ref. 30. The fundamental results will be stated here. Let $S$, denoting the service time, have
a distribution $f(s)$ with finite first, second, and third moments, and let $\lambda$ denote the
intensity of the Poisson arrival process. Then, taking $\rho = \lambda \bar{s}$, the mean number
of customers in the system is given by

$$ \bar{N} = \rho + \frac{\lambda^2\, \overline{s^2}}{2(1 - \rho)} $$

The z-transform of the distribution of the number of customers is

$$ Q(z) = B(\lambda - \lambda z)\, \frac{(1 - \rho)(1 - z)}{B(\lambda - \lambda z) - z} $$

where $B(\lambda - \lambda z)$ refers to the Laplace transform of the service-time distribution,
$B(w)$, evaluated at $\lambda - \lambda z$ with

$$ B(w) = \int_0^{\infty} e^{-wx} f(x)\,dx $$


If $Q(z)$ can be readily inverted or written as a power series, the distribution of the number
of customers can be found directly, but for most cases of interest this is not possible. The
differentiation properties of the z-transform of a distribution are employed to find the moments.
The first moment (mean) is $dQ(z)/dz$ evaluated at $z = 1$. (This yields the same result as the
earlier mean value formula.) The second moment is derived from the relation

$$ \left. \frac{d^2 Q(z)}{dz^2} \right|_{z=1} = \overline{N^2} - \bar{N} $$

Since $z = 1$ is an indeterminate point of $Q(z)$, the evaluation above will require the repeated
application of L'Hôpital's rule. This eventually yields the desired expression for the variance
of the distribution of the number of customers given below.


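$$ \sigma_N^2 = \frac{\lambda^3\, \overline{s^3}}{3(1 - \rho)} + \left( \frac{\lambda^2\, \overline{s^2}}{2(1 - \rho)} \right)^2 + \frac{\lambda^2 (3 - 2\rho)\, \overline{s^2}}{2(1 - \rho)} + \rho(1 - \rho) $$

In all cases, these moments exist only if $\rho < 1$.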
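
The moment formulas can be evaluated directly from the first three service-time moments. The sketch below assumes the standard form of the Pollaczek-Khinchin mean and variance expressions for the number in an M/G/1 system and checks them against the exponential (M/M/1) case, for which the mean is $\rho/(1 - \rho)$ and the variance is $\rho/(1 - \rho)^2$. The function names are hypothetical.

```python
def mg1_mean(lam, s1, s2):
    """Mean number in an M/G/1 system from the first two service-time moments."""
    rho = lam * s1
    if rho >= 1.0:
        raise ValueError("moments exist only if rho < 1")
    return rho + lam ** 2 * s2 / (2.0 * (1.0 - rho))


def mg1_variance(lam, s1, s2, s3):
    """Variance of the number in an M/G/1 system from three service-time moments."""
    rho = lam * s1
    if rho >= 1.0:
        raise ValueError("moments exist only if rho < 1")
    return (
        lam ** 3 * s3 / (3.0 * (1.0 - rho))
        + (lam ** 2 * s2 / (2.0 * (1.0 - rho))) ** 2
        + lam ** 2 * (3.0 - 2.0 * rho) * s2 / (2.0 * (1.0 - rho))
        + rho * (1.0 - rho)
    )


# Check against M/M/1 with service rate 1: moments are 1, 2, and 6.
lam = 0.5
print(mg1_mean(lam, 1.0, 2.0))           # expected rho / (1 - rho)
print(mg1_variance(lam, 1.0, 2.0, 6.0))  # expected rho / (1 - rho)**2
```

Comparing a general formula with a case whose answer is known independently is the same kind of cross-check that the report recommends for symbolic calculations.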

Information about the behavior of the distribution is obtained by means of the Chebyshev
inequality, which states that, for any $\epsilon > 0$

$$ \Pr\left[\,|N - \bar{N}| \ge \epsilon\,\right] \le \frac{\sigma_N^2}{\epsilon^2} $$



III.  UNIFORMLY  DISTRIBUTED  SERVICE  TIMES 

The  uniform  distribution  is  useful  when  only  upper  and  lower  bounds  on  the  service  time 
are  known  and  no  statistical  information  on  its  behavior  between  these  bounds  is  available. 

Let the lower bound be A and the upper bound be B. Then the density takes on the constant
value 1/(B − A) on the interval [A, B] and has moments given below.

$$\bar{S} = (A + B)/2$$
$$\overline{S^2} = (A^2 + AB + B^2)/3$$
$$\overline{S^3} = (B + A)(B^2 + A^2)/4$$

Substitution of these expressions into the formulas of the previous section and some
simplification yields, for the mean number in the system,

$$\bar{N} = \frac{(B^2 + 4AB + A^2)\lambda^2 + (-6B - 6A)\lambda}{\lambda(6B + 6A) - 12}$$

for

$$\rho = \lambda(A + B)/2 < 1$$

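In the case where A = 0 and B = 1, this simplifies further to

$$\bar{N} = \frac{\lambda^2 - 6\lambda}{6\lambda - 12} \quad \text{for } \lambda < 2$$

The variance is found to be

$$
\begin{aligned}
\sigma_N^2 = \lambda^2 \bigl( & 11B^4\lambda^2 + 22AB^3\lambda^2 + 30A^2B^2\lambda^2 + 22A^3B\lambda^2 + 11A^4\lambda^2 \\
& - 3B^4\lambda - 6AB^3\lambda - 12B^3\lambda - 6A^2B^2\lambda - 24AB^2\lambda - 6A^3B\lambda \\
& - 24A^2B\lambda - 54B\lambda - 3A^4\lambda - 12A^3\lambda - 54A\lambda + 6B^3 + 6AB^2 \\
& + 6A^2B + 6A^3 + 108 \bigr) \big/ \bigl[ 18(B\lambda + A\lambda - 2)^2 \bigr]
\end{aligned}
$$

This is not as bad as it looks and is easily coded as several Fortran-like statements.
For the case where [A, B] = [0, 1] it simplifies to

$$\sigma_N^2 = \frac{11\lambda^4 - 69\lambda^3 + 114\lambda^2}{18\lambda^2 - 72\lambda + 72}$$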

In both instances, the same restrictions on $\rho$ as stated for the mean also hold.
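The uniform case can also be evaluated numerically by feeding the three moments into the general formulas of the previous section. The sketch below does this in Python and compares the mean with the closed form given above; the function names are hypothetical. The variance is taken from the general formula rather than from the expanded polynomial, which gives an independent check on a long expression that is easy to mistranscribe.

```python
def uniform_moments(a, b):
    """First three moments of a service time uniform on [a, b]."""
    return ((a + b) / 2,
            (a * a + a * b + b * b) / 3,
            (b + a) * (b * b + a * a) / 4)


def mg1_mean_var(lam, s1, s2, s3):
    """Mean and variance of the number in an M/G/1 system (general formulas)."""
    rho = lam * s1
    if not 0 <= rho < 1:
        raise ValueError("the moments exist only for rho < 1")
    mean = rho + lam**2 * s2 / (2 * (1 - rho))
    var = (lam**3 * s3 / (3 * (1 - rho))
           + (lam**2 * s2 / (2 * (1 - rho)))**2
           + lam**2 * (3 - 2 * rho) * s2 / (2 * (1 - rho))
           + rho * (1 - rho))
    return mean, var


def uniform_mean_closed_form(lam, a, b):
    """Closed-form mean number in system for uniform service on [a, b]."""
    return (((b * b + 4 * a * b + a * a) * lam**2 - 6 * (a + b) * lam)
            / (6 * (a + b) * lam - 12))


lam, a, b = 1.0, 0.0, 1.0
mean, var = mg1_mean_var(lam, *uniform_moments(a, b))
closed = uniform_mean_closed_form(lam, a, b)
print(mean, closed, var)
```

For [A, B] = [0, 1] and $\lambda = 1$ both routes give a mean of 5/6.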

For the special case of constant service times with A = B = C, the resulting expressions
for $\bar{N}$ and $\sigma_N^2$ are

$$\bar{N} = \frac{C\lambda(C\lambda - 2)}{2(C\lambda - 1)}$$

and

$$\sigma_N^2 = \frac{16C^4\lambda^4 + (-4C^4 - 12C^3 - 18C)\lambda^3 + (4C^3 + 18)\lambda^2}{12(C^2\lambda^2 - 2C\lambda + 1)} \quad \text{for } \lambda < 1/C$$

IV.  THE  BI-TRUNCATED  NORMAL  DISTRIBUTION


A wide range of useful distributions is obtained by bounding a section of a normal distribution
above and below and multiplying by an appropriate normalization constant. There is some
intuitive appeal to this distribution. One form of the central limit theorem states that a random
variable that is the sum of a large number of independent random variables with finite variances
will have an approximately normal distribution. The response of a computer system to a service
request certainly depends on a large number of factors, and the finite domain properties discussed
in the first section assure finite variance. Some of the many forms of distribution in this family are
shown in Fig. A-1. All are different sections of a normal distribution with mean equal to 10;
only the variance and bounds are changed.

The probability density function for a normal distribution N(m, σ) truncated above at B
and below at A is

$$f(s) = \frac{G}{\sigma} \exp\left[ -\frac{(s - m)^2}{2\sigma^2} \right] \quad \text{for } A < s < B, \qquad 0 \text{ elsewhere,}$$

where G is found by setting

$$\int_A^B f(s)\, ds = 1$$

to be

$$G = \frac{\sqrt{2/\pi}}{\operatorname{erf}\left( \dfrac{m - A}{\sqrt{2}\,\sigma} \right) - \operatorname{erf}\left( \dfrac{m - B}{\sqrt{2}\,\sigma} \right)}$$

With

$$g(x) = \operatorname{erf}\left( \frac{m - x}{\sqrt{2}\,\sigma} \right)$$

then

$$G = \frac{\sqrt{2/\pi}}{g(A) - g(B)}$$

The first three moments of the bi-truncated normal distribution BN(m, σ, A, B), where m
and σ are the mean and standard deviation of the underlying normal distribution, are given below.

$$\bar{S} = \frac{m R\, [g(B) - g(A)] + C(p - q)}{R\, [g(B) - g(A)]}$$

$$\overline{S^2} = \frac{(m^2 + \sigma^2) R\, [g(B) - g(A)] + C\,[(m + B)\, p - (m + A)\, q]}{R\, [g(B) - g(A)]}$$

and

$$\overline{S^3} = \frac{(3m\sigma^2 + m^3) R\, [g(B) - g(A)] + C\,[(2\sigma^2 + m^2 + Bm + B^2)\, p - (2\sigma^2 + m^2 + Am + A^2)\, q]}{R\, [g(B) - g(A)]}$$

where

$$C = 1/\sqrt{2\pi}$$

$$p = 2\sigma \exp[(2Bm + A^2)/2\sigma^2]$$
$$q = 2\sigma \exp[(2Am + B^2)/2\sigma^2]$$

and

$$R = \exp[(m^2 + A^2 + B^2)/2\sigma^2]$$

Letting D = g(B) − g(A), these moments can be written as

$$\overline{S^k} = \frac{a_k R D + b_k p + c_k q}{R D}$$

with $a_k$, $b_k$, and $c_k$ given in Table A-1.
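The closed-form moments can be checked against direct numerical integration of the truncated density. The sketch below is one way to do this in Python. The closed forms in it are this sketch's reading of the moment expressions, obtained by integrating the truncated normal density, and the function names are hypothetical. The ratios p/R and q/R are formed directly as $2\sigma \exp[-(B-m)^2/2\sigma^2]$ and $2\sigma \exp[-(A-m)^2/2\sigma^2]$, which is algebraically the same thing but avoids overflow in R when m is large relative to σ.

```python
import math


def bn_moments_closed_form(m, sigma, a, b):
    """First three moments of BN(m, sigma, a, b) from closed forms (assumed reading)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)

    def g(x):
        return math.erf((m - x) / (math.sqrt(2.0) * sigma))

    d = g(b) - g(a)
    p_over_r = 2.0 * sigma * math.exp(-(b - m) ** 2 / (2.0 * sigma ** 2))
    q_over_r = 2.0 * sigma * math.exp(-(a - m) ** 2 / (2.0 * sigma ** 2))
    s1 = m + c * (p_over_r - q_over_r) / d
    s2 = m * m + sigma ** 2 + c * ((m + b) * p_over_r - (m + a) * q_over_r) / d
    s3 = (m ** 3 + 3.0 * m * sigma ** 2
          + c * ((2.0 * sigma ** 2 + m * m + b * m + b * b) * p_over_r
                 - (2.0 * sigma ** 2 + m * m + a * m + a * a) * q_over_r) / d)
    return s1, s2, s3


def bn_moments_numeric(m, sigma, a, b, n=20000):
    """Same moments by midpoint-rule integration of the unnormalized density."""
    h = (b - a) / n
    acc = [0.0, 0.0, 0.0, 0.0]
    for i in range(n):
        s = a + (i + 0.5) * h
        w = math.exp(-(s - m) ** 2 / (2.0 * sigma ** 2)) * h
        for k in range(4):
            acc[k] += s ** k * w
    # Dividing by acc[0] applies the normalization constant implicitly.
    return acc[1] / acc[0], acc[2] / acc[0], acc[3] / acc[0]


closed = bn_moments_closed_form(10.0, 3.0, 5.0, 18.0)
numeric = bn_moments_numeric(10.0, 3.0, 5.0, 18.0)
print(closed)
print(numeric)
```

When the bounds are symmetric about m, the first moment reduces to m, which gives a quick second check.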

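Evaluation of these nine coefficients and substitution of $\overline{S^k}$ in the Pollaczek-Khinchin
formulas

$$\bar{N} = \rho + \frac{\lambda^2\, \overline{S^2}}{2(1 - \rho)}$$

and

$$\sigma_N^2 = \frac{\lambda^3\, \overline{S^3}}{3(1 - \rho)} + \left( \frac{\lambda^2\, \overline{S^2}}{2(1 - \rho)} \right)^2 + \frac{\lambda^2 (3 - 2\rho)\, \overline{S^2}}{2(1 - \rho)} + \rho(1 - \rho)$$

with $\rho = \lambda \bar{S}$ will produce the first two moments of the distribution of the number of customers
in an M/BN(m, σ, A, B)/1 queueing system.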
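The whole calculation for an M/BN(m, σ, A, B)/1 queue can be sketched end to end. In the Python sketch below the service-time moments are obtained by numerical integration of the truncated density. That is an assumption of this example, swapped in for the Table A-1 coefficients, which are not reproduced here. All names are hypothetical.

```python
import math


def bn_moments(m, sigma, a, b, n=4000):
    """First three moments of BN(m, sigma, a, b) by midpoint-rule integration."""
    h = (b - a) / n
    acc = [0.0, 0.0, 0.0, 0.0]
    for i in range(n):
        s = a + (i + 0.5) * h
        w = math.exp(-(s - m) ** 2 / (2.0 * sigma ** 2)) * h
        for k in range(4):
            acc[k] += s ** k * w
    return acc[1] / acc[0], acc[2] / acc[0], acc[3] / acc[0]


def mbn1_mean_var(lam, m, sigma, a, b):
    """Mean and variance of the number in an M/BN(m, sigma, a, b)/1 system."""
    s1, s2, s3 = bn_moments(m, sigma, a, b)
    rho = lam * s1
    if not 0 <= rho < 1:
        raise ValueError("the moments exist only for rho < 1")
    mean = rho + lam**2 * s2 / (2 * (1 - rho))
    var = (lam**3 * s3 / (3 * (1 - rho))
           + (lam**2 * s2 / (2 * (1 - rho)))**2
           + lam**2 * (3 - 2 * rho) * s2 / (2 * (1 - rho))
           + rho * (1 - rho))
    return mean, var


# Bounds at m +/- 8 sigma, so the truncation has a negligible effect and the
# service moments are essentially those of the untruncated normal.
mean, var = mbn1_mean_var(0.5, 1.0, 0.1, 0.2, 1.8)
print(mean, var)
```

With bounds this wide the service moments are close to $m$, $m^2 + \sigma^2$, and $m^3 + 3m\sigma^2$, so the result can be checked by hand against the formulas above.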